Aiming at the problems of detail information loss and low segmentation accuracy in the segmentation of day and night ground-based cloud images, a segmentation network called CloudResNet-UNetwork (CloudRes-UNet) for day and night ground-based cloud images based on improved Res-UNet (Residual network-UNetwork) was proposed, in which the overall encoder-decoder network structure was adopted. Firstly, ResNet50 was used by the encoder to extract features, enhancing the feature extraction ability. Secondly, a Multi-Stage feature extraction (Multi-Stage) module was designed, which combined group convolution, dilated convolution and channel shuffle to obtain high-intensity semantic information. Thirdly, the Efficient Channel Attention Network (ECA-Net) module was added to focus on the important information in the channel dimension, strengthen the attention to the cloud regions in ground-based cloud images, and improve the segmentation accuracy. Finally, bilinear interpolation was used by the decoder to upsample the features, which improved the clarity of the segmented images and reduced the loss of object and position information. The experimental results show that, compared with the state-of-the-art deep-learning-based ground-based cloud image segmentation network Cloud-UNetwork (Cloud-UNet), CloudRes-UNet increases the segmentation accuracy on the day and night ground-based cloud image segmentation dataset by 1.5 percentage points and the Mean Intersection over Union (MIoU) by 1.4 percentage points, indicating that CloudRes-UNet obtains cloud information more accurately. It is of positive significance for weather forecasting, climate research, photovoltaic power generation and so on.
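The ECA-Net channel attention mentioned above is simple enough to sketch. The following is a minimal illustrative implementation in plain Python — not the paper's code; the uniform convolution weights and the kernel size are assumptions (in ECA-Net the 1-D convolution weights are learned during training):

```python
import math

def eca_attention(feature_map, kernel_size=3):
    """Illustrative sketch of ECA-Net channel attention.

    feature_map: list of channels, each a 2-D list (H x W) of floats.
    Returns the channel-weighted feature map.
    """
    # 1. Global average pooling: one scalar descriptor per channel.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

    # 2. 1-D convolution across the channel dimension (uniform weights
    #    assumed here for illustration; ECA-Net learns them).
    k, pad = kernel_size, kernel_size // 2
    padded = [0.0] * pad + desc + [0.0] * pad
    conv = [sum(padded[i + j] / k for j in range(k))
            for i in range(len(desc))]

    # 3. Sigmoid gate, then rescale each channel by its attention weight.
    gate = [1.0 / (1.0 + math.exp(-v)) for v in conv]
    return [[[x * g for x in row] for row in ch]
            for ch, g in zip(feature_map, gate)]
```

The cheap 1-D convolution over channel descriptors is what makes ECA-Net lighter than squeeze-and-excitation-style attention with fully connected layers.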
Concerning the current lack of effective development and deployment tools for deep learning applications, a component-based development framework for deep learning applications was proposed. The framework splits functions according to the type of resource consumption, uses a review-guided resource allocation scheme for bottleneck elimination, and uses a step-by-step boxing scheme for function placement that takes into account high CPU utilization and low memory overhead. The real-time license plate number detection application developed based on this framework achieved 82% GPU utilization in throughput-first mode, 0.73 s average application latency in latency-first mode, and 68.8% average CPU utilization across three modes (throughput-first mode, latency-first mode, and balanced throughput/latency mode). The experimental results show that, based on this framework, hardware throughput and application latency can be configured in a balanced way, efficiently utilizing the computing resources of the platform in throughput-first mode and meeting the low-latency requirements of applications in latency-first mode. Compared with MediaPipe, this framework enabled the development of an ultra-real-time multi-person pose estimation application and improved the detection frame rate of the application by up to 1 077%. The experimental results show that the framework is an effective solution for deep learning application development and deployment on CPU-GPU heterogeneous servers.
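The "step-by-step boxing" placement that balances high CPU utilization against low memory overhead can be sketched as a first-fit-decreasing bin packing. This is a hypothetical reconstruction, not the framework's actual scheme; all names and fields are assumptions:

```python
def place_functions(functions, cpu_capacity, mem_capacity):
    """Hypothetical first-fit-decreasing sketch of function placement.

    functions: list of (name, cpu_demand, mem_demand) tuples.
    Packs functions onto as few workers as possible so CPU stays highly
    utilized while each worker's memory stays within bound.
    """
    workers = []  # each worker: {"cpu": used, "mem": used, "fns": [...]}
    # Sort by CPU demand, largest first, so big consumers anchor the boxes.
    for name, cpu, mem in sorted(functions, key=lambda f: -f[1]):
        for w in workers:
            if w["cpu"] + cpu <= cpu_capacity and w["mem"] + mem <= mem_capacity:
                # First worker with room takes the function.
                w["cpu"] += cpu
                w["mem"] += mem
                w["fns"].append(name)
                break
        else:
            # No existing worker fits: open a new one.
            workers.append({"cpu": cpu, "mem": mem, "fns": [name]})
    return workers
```

For example, placing a decode/infer/draw pipeline with CPU demands 0.6, 0.5, 0.3 on workers with unit CPU capacity co-locates the decode and draw stages while giving inference its own worker.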
Multi-modal abstractive summarization is commonly based on the Sequence-to-Sequence (Seq2Seq) framework, and the objective function optimizes the model at the character level: it searches for locally optimal results to generate words while ignoring the global semantic information of the summary samples. This may cause semantic deviation between the summary and the multimodal information, resulting in factual errors. In order to solve the above problems, a multi-modal summarization model based on semantic relevance analysis was proposed. Firstly, a summary generator based on the Seq2Seq framework was trained to generate candidate summaries with semantic diversity. Secondly, a summary evaluator based on semantic relevance analysis was applied to learn the semantic differences among candidate summaries and the evaluation mode of ROUGE (Recall-Oriented Understudy for Gisting Evaluation) from a global perspective, so that the model could be optimized at the level of summary samples. Finally, the summary evaluator was used to carry out reference-free evaluation of the candidate summaries, making the finally selected summary sample as similar as possible to the source text in semantic space. Experiments on the benchmark dataset MMSS show that the proposed model improves the evaluation indexes of ROUGE-1, ROUGE-2 and ROUGE-L by 3.17, 1.21 and 2.24 percentage points respectively compared with the current optimal MPMSE (Multimodal Pointer-generator via Multimodal Selective Encoding) model.
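The final reference-free selection step can be sketched as follows, assuming an upstream encoder (e.g. the trained evaluator) yields a semantic vector for the source text and for each candidate summary; the cosine-similarity formulation is an assumption for illustration:

```python
import math

def select_summary(source_vec, candidate_vecs):
    """Pick the candidate whose semantic vector is closest to the source
    text's vector (cosine similarity); returns the candidate's index."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    scores = [cosine(source_vec, c) for c in candidate_vecs]
    # Reference-free: no gold summary is consulted, only the source text.
    return max(range(len(scores)), key=scores.__getitem__)
```

Because selection compares whole candidate vectors against the source, optimization happens at the summary-sample level rather than word by word.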
Internet of Vehicles (IoV) traffic monitoring requires the transmission, storage and analysis of users' private data, making the security guarantee of private data particularly crucial. However, traditional security solutions often struggle to guarantee real-time computing and data security at the same time. To address this issue, security protocols, including two initialization protocols and a periodic reporting protocol, were designed, and a Software Guard Extensions (SGX)-based IoV traffic monitoring Secure Data Processing Framework (SDPF) was built. In SDPF, trusted hardware was used to enable plaintext computation of private data in the Road Side Unit (RSU), and efficient operation and privacy protection of the framework were ensured through the security protocols and a hybrid encryption scheme. Security analysis shows that SDPF is resistant to eavesdropping, tampering, replay, impersonation, rollback, and other attacks. Experimental results show that all computational operations of SDPF are at millisecond level; specifically, the total data processing overhead of a single vehicle is less than 1 millisecond. Compared with PFCF (Privacy-preserving Fog Computing Framework for vehicular crowdsensing networks) based on fog computing and PPVF (Privacy-preserving Protocol for Vehicle Feedback in cloud-assisted Vehicular Ad hoc NETwork (VANET)) based on homomorphic encryption, SDPF has a more comprehensive security design: the message length of a single session is reduced by more than 90%, and the computational cost is reduced by at least 16.38%.
Focusing on head pose flipping and the loss of implicit spatial cues between image features when reconstructing the human body from monocular images, a three-dimensional human reconstruction model based on High-Resolution Net (HRNet) and Graph Convolutional Network (GCN) was proposed. Firstly, rich human feature information was extracted from the original image by using HRNet and residual blocks as the backbone network. Then, an accurate spatial feature representation was obtained by using GCN to capture the implicit spatial cues. Finally, the parameters of the Skinned Multi-Person Linear model (SMPL) were predicted by using these features, thereby obtaining more accurate reconstruction results. At the same time, to effectively solve the problem of head pose flipping, the joint points of SMPL were redefined and a definition of the head joint points was added on the basis of the original joints. Experimental results show that this model can accurately reconstruct the three-dimensional human body. The reconstruction accuracy of this model on the 2D dataset LSP reaches 92.41%, and the joint error and reconstruction error of the model are greatly reduced on the 3D dataset MPI-INF-3DHP, averaging only 97.73 mm and 64.63 mm respectively, verifying the effectiveness of the proposed model in the field of human reconstruction.
By analyzing the problem of job performance interference in distributed machine learning, it was found that performance interference is caused by the uneven allocation of GPU resources, such as memory overload and bandwidth competition. To this end, a mechanism for quickly predicting performance interference between jobs was designed and implemented, which can adaptively predict the degree of job interference according to the given GPU parameters and job types. First, the GPU parameters and interference rates during the operation of distributed machine learning jobs were obtained through experiments, and the influences of various parameters on performance interference were analyzed. Second, several GPU parameter-interference rate models were established by using multiple prediction technologies, and the job interference rate errors were analyzed. Finally, an adaptive job interference rate prediction algorithm was proposed to automatically select the prediction model with the smallest error for a given equipment environment and job set, so as to predict job interference rates quickly and accurately. Experiments were designed with five commonly used neural network tasks on two GPU devices, and the results were analyzed. The results show that the proposed Adaptive Interference Prediction (AIP) mechanism can quickly complete the selection of the prediction model and the performance interference prediction without any pre-assumed information, with a time consumption of less than 300 s and a prediction error rate in the range of 2% to 13%, so it can be applied to scenarios such as job scheduling and load balancing.
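The core of the adaptive selection step — keep whichever candidate model has the smallest error on profiled jobs — can be sketched as below. This is a minimal sketch under assumed interfaces: each model is a function from a job's GPU-parameter dict to a predicted interference rate, and all names are illustrative:

```python
def adaptive_select(models, profile_jobs, true_rates):
    """Select the prediction model with the smallest mean absolute error.

    models: dict mapping model name -> prediction function.
    profile_jobs: list of GPU-parameter dicts measured during profiling.
    true_rates: measured interference rates for those jobs.
    Returns (best_model_name, best_model_function).
    """
    def mae(predict):
        errs = [abs(predict(job) - rate)
                for job, rate in zip(profile_jobs, true_rates)]
        return sum(errs) / len(errs)

    # Pick the model whose predictions deviate least from measurements.
    best = min(models, key=lambda name: mae(models[name]))
    return best, models[best]
```

The selected model is then reused for fast predictions on new job pairs in the same equipment environment, without re-profiling.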
Label modeling is the basic task of label system construction and portrait construction. Traditional label modeling methods have problems such as difficulty in processing fuzzy labels, unreasonable label extraction, and ineffective integration of multi-modal entities and multi-dimensional relationships. Aiming at these problems, an enterprise profile construction method based on label layering and deepening modeling, called EPLLD (Enterprise Portrait of Label Layering and Deepening), was proposed. Firstly, multi-characteristic information was extracted through multi-source information fusion, and the fuzzy labels of enterprises (such as labels in wholesale and retail industries that cannot fully summarize the characteristics of an enterprise) were counted and screened. Secondly, a professional domain lexicon was established for feature expansion, combined with the BERT (Bidirectional Encoder Representation from Transformers) language model for multi-feature extraction. Thirdly, Bi-directional Long Short-Term Memory (BiLSTM) was used to obtain fuzzy label deepening results. Finally, keywords were extracted through TF-IDF (Term Frequency-Inverse Document Frequency), TextRank, and the Latent Dirichlet Allocation (LDA) model to achieve label layering and deepening modeling. Experimental analysis on the same enterprise dataset shows that the precision of EPLLD in the fuzzy label deepening task is 91.11%, which is higher than that of 8 label processing methods such as BiLSTM+Attention and BERT+Deep CNN.
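Of the three keyword extractors the method combines, TF-IDF is the simplest to sketch in plain Python (TextRank and LDA are omitted; tokenization is assumed to be done upstream):

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the terms of docs[doc_index] by TF-IDF over the corpus.

    docs: list of documents, each a list of tokens.
    Returns the top_k highest-scoring terms.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Term frequency within the target document.
    tf = Counter(docs[doc_index])
    total = sum(tf.values())
    # TF-IDF: frequent in this document, rare across the corpus.
    score = {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
    return [t for t, _ in sorted(score.items(), key=lambda kv: -kv[1])[:top_k]]
```

Terms that appear in every document (e.g. a generic industry word) score zero, which is exactly why TF-IDF helps separate distinctive labels from fuzzy ones.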
When using the Immersed Boundary-Lattice Boltzmann Method (IB-LBM) to solve the flow field, a larger and denser flow field grid is often required to obtain more accurate results, which makes the simulation process time-consuming. In order to improve the efficiency of the simulation, a parallel optimization method of IB-LBM was proposed according to the locality of IB-LBM computation, combining three different task scheduling methods in OpenMP. In the parallel optimization, the three task scheduling modes were mixed to solve the load imbalance problem caused by a single task scheduling mode. A structural decomposition was performed on IB-LBM, the optimal scheduling mode of each structure part was tested, and the optimal scheduling combination was selected based on the experimental results. It was also found that the optimal combination differs under different thread counts. The optimization results were verified by speedup: when the number of threads is small, the speedup approaches the ideal state; when the number of threads is large, although the extra overhead of creating and destroying threads limits the performance gain, the parallel performance of the model is still greatly improved. The flow field simulation results show that the accuracy of IB-LBM in simulating fluid-solid coupling problems is not affected by the parallel optimization.
For the low security and limited capacity of steganographic methods based on a single data type, a new text steganography method with hierarchical security was proposed. First, multiple types of data in the whole cover document were regarded as optional steganographic covers to build a hierarchical security steganographic model upon steganographic security levels defined by taking the characteristics of different data types and steganalysis resistance as evaluation criteria. Then, a security level was adaptively determined by the secret message length, and the secret message was embedded into the selected independent data types in a cover document with the help of the built model. Theoretical analysis and experimental results show that, compared with steganography based on a single data type, the proposed method expands the steganographic capacity and reduces the modifications of the statistical characteristics of any single data type in the cover document when the same secret message is embedded. In conclusion, the proposed method improves the security of the secret message.
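The adaptive choice of a security level by message length can be sketched as below. The spill-over rule and the (name, capacity) description of a level are assumptions for illustration, not the paper's exact scheme:

```python
def choose_level(message_len, levels):
    """Distribute a secret message across security levels by length.

    levels: list of (name, capacity_in_bits), ordered from most secure /
    lowest capacity to least secure / highest capacity. The most secure
    levels are filled first; overflow spills to the next level.
    Returns a plan of (level_name, bits_embedded) pairs.
    """
    plan, remaining = [], message_len
    for name, capacity in levels:
        if remaining <= 0:
            break
        used = min(remaining, capacity)
        plan.append((name, used))
        remaining -= used
    if remaining > 0:
        # The cover document cannot hold the whole message.
        raise ValueError("message exceeds total steganographic capacity")
    return plan
```

Short messages thus stay entirely within the most steganalysis-resistant data type, and only longer messages touch the lower levels — matching the idea that security degrades gracefully with payload size.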
Because the Register Swapping (RS) method does not consider the effect of register allocation on reducing soft errors in register files, a static register reallocation approach concerning the effect of live variables on soft errors was proposed. First, the weight of a live variable was introduced to evaluate its impact on the soft errors of register files; then two rules were put forward to reallocate live variables after the register swapping phase. This approach further reduces soft errors at the live-variable level. The experiments and analysis show that this approach reduces soft errors by a further 30% compared with the RS method, enhancing the reliability of register files.
Evolutionary Algorithm based on State-space model (SEA) is a novel real-coded evolutionary algorithm with good optimization performance in engineering optimization problems. The global convergence of crossover SEA (SCEA) was studied to promote the theoretical and applied research of SEA, and the conclusion that SCEA is not globally convergent was drawn. A Modified Crossover Evolutionary Algorithm based on State-space Model (SMCEA) was presented by changing the construction of the state evolution matrix and introducing an elastic search operation. That SMCEA is globally convergent was proved by means of a homogeneous finite Markov chain. Experimental analysis on two test functions shows that SMCEA is substantially improved in such aspects as convergence rate, ability to reach the optimal value, and operation time, which proves the effectiveness of SMCEA and leads to the conclusion that SMCEA is better than the Genetic Algorithm (GA) and SCEA.
Evolutionary Algorithm based on State-space model (SEA) is a new real-coded evolutionary algorithm with broad application prospects in engineering optimization problems. The global convergence of SEA was analyzed by means of a homogeneous finite Markov chain to improve the theoretical system of SEA and promote its application research in engineering optimization problems. It was proved that SEA is not globally convergent. A Modified Elastic Evolutionary Algorithm based on State-space model (MESEA) was presented by limiting the value ranges of the elements in the state evolution matrix of SEA and introducing an elastic search. The analytical results show that the search efficiency of SEA can be enhanced by introducing the elastic search. The conclusion that MESEA is globally convergent is drawn, which provides a theoretical basis for applying the algorithm to engineering optimization problems.
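The global-convergence criterion underlying both Markov-chain analyses above can be stated as follows (a standard formulation, not quoted from the papers):

```latex
% P_t: population at generation t;  X^*: set of global optima.
% An evolutionary algorithm modeled as a homogeneous finite Markov chain
% is globally convergent iff
\lim_{t \to \infty} \Pr\{\, P_t \cap X^{*} \neq \emptyset \,\} = 1 .
% A standard sufficient condition: from every state the chain reaches an
% optimum-containing state with positive probability, and such states are
% absorbing (elitism) -- which is what restricting the state evolution
% matrix and adding the elastic search aim to ensure.
```

Proving that an algorithm is *not* globally convergent then amounts to exhibiting states from which the probability of ever containing an optimum stays bounded below 1.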